
Virus Evolution

26 training papers, 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
SARS-CoV-2 genome surveillance in Mainz, Germany, reveals convergent origin of the N501Y spike mutation in a hospital setting
2021-02-12 infectious diseases 10.1101/2021.02.11.21251324
#1 (8.3%)

While establishing regional SARS-CoV-2 variant surveillance by genome sequencing, we have identified three infected individuals in a clinical setting (two long-term hospitalized patients and a nurse) who shared the spike N501Y mutation within a genotype background distinct from the current viral variants of concern. We suggest that the adaptive N501Y mutation, known to increase SARS-CoV-2 transmissibility, arose by convergent evolution around December in Mainz, Germany. Hospitalized patients ...

2
Population genomic insights into the evolution of the SARS-CoV-2 Omicron variant
2022-06-27 epidemiology 10.1101/2022.06.27.22276933
#1 (7.4%)

A thorough understanding of the patterns of population subdivision of a pathogen can prevent disease spread. For SARS-CoV-2, the availability of millions of genomes makes this task analytically challenging. Our study used population genomic methods and identified subtle subdivisions within the Omicron variant, in addition to that captured by the Pango lineage. Further, some of the identified clusters of the Omicron variant revealed statistically significant signatures of selection or expansion r...

3
A highly divergent SARS-CoV-2 lineage B.1.1 sample in a patient with long-term COVID-19
2023-09-17 infectious diseases 10.1101/2023.09.14.23295379
#1 (7.1%)

We report the genomic analysis of a highly divergent SARS-CoV-2 sample obtained in October 2022 from an HIV+ patient with presumably long-term COVID-19 infection. Phylogenetic analysis indicates that the sample is characterized by a gain of 89 mutations since divergence from its nearest sequenced neighbor, which had been collected in September 2020 and belongs to the B.1.1 lineage, largely extinct in 2022. 33 of these mutations were coding and occurred in the Spike protein. Of these, 17 are line...

4
L18F substrain of SARS-CoV-2 VOC-202012/01 is rapidly spreading in England
2021-02-09 epidemiology 10.1101/2021.02.07.21251262
#1 (6.1%)

The Variant of Concern (VOC)-202012/01 (also known as B.1.1.7) is a rapidly growing lineage of SARS-CoV-2. In January 2021, VOC-202012/01 constituted about 80% of SARS-CoV-2 genomes sequenced in England and was present in 27 out of 29 countries that reported at least 50 viral genomes. As this strain will likely spread globally towards fixation, it is important to monitor its molecular evolution. Based on GISAID data we systematically estimated growth rates of mutations acquired by the VOC lineag...

5
Predictive Modeling of COVID-19 Variant Peak Prevalence and Duration Using GISAID Data Across 15 Countries
2026-02-05 infectious diseases 10.64898/2026.02.04.26345559
#1 (6.1%)

Background: Rapid emergence and replacement of SARS-CoV-2 variants underscore the need for early and reliable indicators of variant dominance to guide timely public health response. However, early genomic trajectories are typically short, sparse, and noisy, with strong fluctuations and substantial cross-country heterogeneity in sequencing intensity and reporting. Methods: We develop a scalable forecasting framework that predicts whether new variants will reach high prevalence and how long they will...

6
Single genome amplification and molecular cloning of HIV-1 populations in acute HIV-1 infection: implications for studies on HIV-1 diversity and evolutionary rate
2025-03-17 hiv aids 10.1101/2025.03.17.25324117
#1 (6.1%)

Background: Human immunodeficiency virus type 1 (HIV-1) is one of the fastest evolving human pathogens. Understanding transmission, within-host adaptation, and evolutionary dynamics is pivotal for the development of interventions and vaccines. HIV-1 infection is generally caused by one single transmitted founder virus (TFV), and TFV sequences have typically been obtained using single genome amplification (SGA). However, suboptimal sample quality can result in sequencing failures, representing non-tri...

7
Emergence of SARS-CoV-2 stains harbouring the signature mutations of both A2a and A3 clade
2021-02-08 epidemiology 10.1101/2021.02.04.21251117
#1 (6.0%)

SARS-CoV-2 strains with both high transmissibility and the potential to cause asymptomatic infection are expected to gain a selective advantage over other circulating strains that have either high transmissibility or the ability to trigger asymptomatic infection. The D614G mutation in the spike glycoprotein, the characteristic mutation of the A2a clade, has been associated with high transmissibility, whereas the A3 clade-specific mutation L37F in the NSP6 protein has been linked with asymptomatic infection. In this study, w...

8
Shared within-host SARS-CoV-2 variation in households
2022-05-27 epidemiology 10.1101/2022.05.26.22275279
#1 (5.9%)

Background: The limited variation observed among SARS-CoV-2 consensus sequences makes it difficult to reconstruct transmission linkages in outbreak settings. Previous studies have recovered variation within individual SARS-CoV-2 infections but have not yet measured the informativeness of within-host variation for transmission inference. Methods: We performed tiled amplicon sequencing on 307 SARS-CoV-2 samples from four prospective studies and combined sequence data with household membership data, a...

9
Genomic epidemiology of SARS-CoV-2 in Russia reveals recurring cross-border transmission throughout 2020
2021-04-06 epidemiology 10.1101/2021.03.31.21254115
#1 (5.9%)

SARS-CoV-2 has spread rapidly across the globe, with most nations failing to prevent or substantially delay its introduction. While many countries have imposed some limitations on trans-border passenger traffic, the effect of these measures on the spread of COVID-19 strains remains unclear. Here, we report an analysis of whole-genome sequencing of 3206 SARS-CoV-2 samples from 78 regions of Russia covering the period between March and November 2020. We describe recurring imports of multiple COVID...

10
Operationalizing genomic epidemiology during the Nord-Kivu Ebola outbreak, Democratic Republic of the Congo.
2020-06-09 epidemiology 10.1101/2020.06.08.20125567
#1 (5.6%)

The Democratic Republic of the Congo declared its tenth Ebola virus disease outbreak in July 2018, which has circulated primarily in the Nord Kivu province. In addition to standard epidemiologic surveillance and response efforts, the Institut National de Recherche Biomedicale implemented an end-to-end genomic surveillance system, including sequencing, bioinformatic analysis, and dissemination of genomic epidemiologic results to frontline public health workers. Here we report 538 new genomes from...

11
Emergence of N antigen SARS-CoV-2 genetic variants escaping detection of antigenic tests
2021-03-26 infectious diseases 10.1101/2021.03.25.21253802
#1 (5.4%)

SARS-CoV-2 genetic variants are emerging as a major threat to vaccination efforts worldwide as they may increase virus transmission rate and/or confer the ability to escape vaccine-induced immunity, with knock-on effects on the level of herd immunity and vaccine efficacy, respectively. These variants concern the Spike protein, which is encoded by the S gene, involved in virus entry into host cells and the major target of vaccine development. We report here that genetic variants of the N gene can i...

12
COVID-19 pandemic re-shaped the global dispersal of seasonal influenza viruses
2023-12-24 infectious diseases 10.1101/2023.12.20.23300299
#1 (5.2%)

Understanding how the global dispersal patterns of seasonal influenza viruses were perturbed during and after the COVID-19 pandemic is needed to inform influenza intervention and vaccination strategies in the post-pandemic period. Although global human mobility has been identified as a key driver of influenza dispersal, alongside climatic and evolutionary factors, the impact of international travel restrictions on global influenza transmission and recovery remains unknown. Here we combine mo...

13
Evaluation of population immunity against SARS-CoV-2 variants, EG.5.1, FY.4, BA.2.86, JN.1, and JN.1.4, using samples from two health demographic surveillance systems in Kenya.
2024-06-26 infectious diseases 10.1101/2024.06.26.24309525
#1 (5.2%)

Increased immune evasion by emerging and highly mutated SARS-CoV-2 variants is a key challenge to the control of COVID-19. Most of these mutations target the spike protein, allowing the new variants to escape the immunity previously raised by vaccination and/or infection by earlier variants of SARS-CoV-2. In this study, we investigated the neutralizing capacity of antibodies against emerging variants of interest circulating between May 2023 and March 2024 using sera from represent...

14
The genetic diversity of Nipah virus across spatial scales
2023-07-16 epidemiology 10.1101/2023.07.14.23292668
#1 (5.0%)

Nipah virus (NiV), a highly lethal virus in humans, circulates silently in Pteropus bats throughout South and Southeast Asia. Difficulty in obtaining genomes from bats means we have a poor understanding of NiV diversity, including how many lineages circulate within a roost and the spread of NiV over increasing spatial scales. Here we develop phylogenetic approaches applied to the most comprehensive collection of genomes to date (N=257, 175 from bats, 73 from humans) from six countries over 22 ye...

15
Genomic epidemiology and evolutionary dynamics of respiratory syncytial virus group B in Kilifi, Kenya, 2015-17
2020-03-12 epidemiology 10.1101/2020.03.08.20032920
#1 (4.9%)

Respiratory syncytial virus (RSV) circulates worldwide and is a leading cause of acute respiratory illness in young children. There is a paucity of genomic data from purposively sampled populations by which to investigate evolutionary dynamics and transmission patterns of RSV. Here we present an analysis of 295 RSV group B genomes from Kilifi, coastal Kenya, sampled from individuals seeking outpatient care in 9 health facilities across a defined geographical area (890 km²), over 2 RSV epidemics be...

16
Using Genome Sequence Data to Predict SARS-CoV-2 Detection Cycle Threshold Values
2022-11-15 epidemiology 10.1101/2022.11.14.22282297
#1 (4.9%)

The continuing emergence of SARS-CoV-2 variants of concern (VOCs) presents a serious public health threat, exacerbating the effects of the COVID-19 pandemic. Although millions of genomes have been deposited in public archives since the start of the pandemic, predicting SARS-CoV-2 clinical characteristics from the genome sequence remains challenging. In this study, we used a collection of over 29,000 high-quality SARS-CoV-2 genomes to build machine learning models for predicting clinical detection...

17
HIV-DRIVES: HIV Drug Resistance Identification, Variant Evaluation, & Surveillance Pipeline
2023-10-02 hiv aids 10.1101/2023.09.30.23296350
#1 (4.9%)

The global prevalence of resistance to Human Immunodeficiency Virus (HIV) combined antiretroviral therapy (cART) emphasizes the need for continuous monitoring to better understand the dynamics of drug-resistance mutations, to guide treatment optimization and patient management, and to check the spread of resistant viral strains. We have recently integrated next-generation sequencing (NGS) into routine HIV drug resistance (HIVDR) monitoring, with key challenges in the bioinformatic analysis ...

18
Tracing the evolutionary path of the CCR5delta32 deletion via ancient and modern genomes
2023-06-20 genetic and genomic medicine 10.1101/2023.06.15.23290026
#1 (4.9%)

The chemokine receptor variant CCR5delta32 is linked to HIV-1 infection resistance and other pathological conditions. In European populations, the allele frequency ranges from 10-16%, and its evolution has been extensively debated throughout the years. We provide a detailed perspective of the evolutionary history of the deletion through time and space. We discovered that the CCR5delta32 allele arose on a pre-existing haplotype consisting of 84 variants. Using this information, we developed a hap...

19
Effects of genomic recombination on SARS-CoV-2 evolution and the growth of the recombinant variant XFG in Germany
2025-11-17 epidemiology 10.1101/2025.11.17.25339653
#1 (4.8%)

In recent years, multiple predominant SARS-CoV-2 variants that spread worldwide were derived from genomic recombination between SARS-CoV-2 lineages with diverse genetic backgrounds. However, the current understanding of the effects of recombination on SARS-CoV-2 evolution and functional aspects is limited. In this study, to obtain an overview of the evolution of SARS-CoV-2 recombinant variants, phylogenetic analyses were performed to evaluate the divergence of repre...

20
Synonymous substitution rate slowdown preceding the emergence of SARS-CoV-2 variants and during persistent infections
2026-01-28 epidemiology 10.64898/2026.01.26.26344861
#1 (4.8%)

The emergence of variants has shaped the COVID-19 pandemic. The lack of directly observed precursors to these variants has led to proposals that variants emerge from either persistent infections, transmission in non-human animal populations after reverse zoonosis, or cryptic transmission in the human population. We investigated the origin of variants by analyzing the molecular clock and the rates of nonsynonymous and synonymous substitutions in SARS-CoV-2 circulating in the human population, persistently...